Fast Cache for Your Text: Accelerating Exact Pattern Matching with Feed-Forward Bloom Filters

Authors

  • Iulian Moraru
  • David G. Andersen
Abstract

This paper presents an algorithm for exact pattern matching based on a new type of Bloom filter that we call a feed-forward Bloom filter. Besides filtering the input corpus, a feed-forward Bloom filter can also reduce the set of patterns needed for the exact matching phase. We show that this technique, along with a CPU-architecture-aware design of the Bloom filter, can provide speedups between 2× and 30× and memory consumption reductions as large as 50× compared with grep, while the filtering speed can be as much as 5× higher than that of a normal Bloom filter. This research was supported by grants from the National Science Foundation, Google, Network Appliance, Intel Corporation and Carnegie Mellon Cylab.
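The two roles described in the abstract (filtering the corpus and shrinking the pattern set) can be illustrated with a small sketch. It is not the authors' implementation. It assumes that a second bit vector records which filter bits were actually hit while scanning the corpus, so that patterns whose bits were never all hit can be dropped before exact matching. The names `FeedForwardBloomFilter` and `prefilter`, the SHA-256 double hashing, the one-byte-per-bit arrays, and the restriction to equal-length patterns are all assumptions made for illustration, and none of the CPU-architecture-aware optimizations are modeled.

```python
import hashlib


class FeedForwardBloomFilter:
    """Illustrative sketch only; all names and parameters are hypothetical."""

    def __init__(self, num_bits=1 << 16, num_hashes=4):
        self.m = num_bits
        self.k = num_hashes
        # First vector: an ordinary Bloom filter built from the patterns.
        self.pattern_bits = bytearray(num_bits)
        # Second vector: bits hit by corpus windows that passed the filter.
        self.feedback_bits = bytearray(num_bits)

    def _positions(self, item: bytes):
        # Assumed hashing scheme: double hashing over one SHA-256 digest.
        digest = hashlib.sha256(item).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def add_pattern(self, pattern: bytes):
        for p in self._positions(pattern):
            self.pattern_bits[p] = 1

    def test_and_record(self, window: bytes) -> bool:
        """Return True on a (possible) match and feed its bits forward."""
        pos = self._positions(window)
        if all(self.pattern_bits[p] for p in pos):
            for p in pos:
                self.feedback_bits[p] = 1
            return True
        return False

    def surviving_patterns(self, patterns):
        # A pattern that occurred in the corpus has all of its bits recorded,
        # so dropping the others never loses a true match.
        return [pat for pat in patterns
                if all(self.feedback_bits[p] for p in self._positions(pat))]


def prefilter(lines, patterns):
    """Return (candidate lines, reduced pattern set) for an exact matcher."""
    length = len(patterns[0])  # assumption: all patterns have equal length
    ffbf = FeedForwardBloomFilter()
    for pat in patterns:
        ffbf.add_pattern(pat)
    candidates = []
    for line in lines:
        hit = False
        # Test every window (no early exit) so all hits are recorded.
        for i in range(len(line) - length + 1):
            if ffbf.test_and_record(line[i:i + length]):
                hit = True
        if hit:
            candidates.append(line)
    return candidates, ffbf.surviving_patterns(patterns)


lines = [b"the quick brown fox", b"lorem ipsum dolor", b"nothing here"]
patterns = [b"quick", b"ipsum", b"zebra"]
candidates, reduced = prefilter(lines, patterns)
```

In this sketch, `candidates` and `reduced` would then be handed to an exact matcher, which is where false positives from either vector are eliminated. Patterns that occur in the corpus always survive the reduction; a pattern that does not occur, such as `b"zebra"` here, is dropped unless its bits happen to collide with recorded ones.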

Similar resources

SigMatch: Fast and Scalable Multi-Pattern Matching

Multi-pattern matching involves matching a data item against a large database of “signature” patterns. Existing algorithms for multi-pattern matching do not scale well as the size of the signature database increases. In this paper, we present sigMatch – a fast, versatile, and scalable technique for multi-pattern signature matching. At its heart, sigMatch organizes the signature database into a (...

Data Caching in Ad Hoc Networks using Bloom Filters

Data caching provides efficient data access by maintaining replicas of data in strategic parts of the network. However, current research in this area does not manage the memory space of each node efficiently. We propose an improvement by considering Bloom filters, a fast, space-efficient probabilistic method for looking up data. We compare the system performance with and without Bloom fil...

Accelerating Boolean Matching Using Bloom Filter

Boolean matching is a fundamental problem in FPGA synthesis, but existing Boolean matchers are not scalable to complex PLBs (programmable logic blocks) and large circuits. This paper proposes a filter-based Boolean matching method, F-BM, which accelerates Boolean matching using lookup tables implemented by Bloom filters storing precalculated matching results. To show the effectiveness of the pr...

Cache Efficient Bloom Filters for Shared Memory Machines

Bloom filters are a well-known data structure that supports approximate set membership queries that report no false negatives. Each element in the universe represented by the Bloom filter is associated with k random bits in the structure. Traditional Bloom filters, therefore, require k non-local memory operations to insert an element or perform a lookup. For very large Bloom filters, these k l...

Fast and Scalable Pattern Matching for Memory Architecture

Multi-pattern matching is known to require intensive memory accesses and is often a performance bottleneck. Hence specialized hardware-accelerated algorithms are being developed for line-speed packet processing. While several pattern matching algorithms have already been developed for such applications, we find that most of them suffer from scalability issues. We present a hardware-implementabl...

Publication date: 2009